A Sequence-to-Sequence Framework Based on Transformer With Masked Language Model for Optical Music Recognition

Abstract

Optical music recognition technology is of great significance in the development of digital music. In recent years, the convolutional recurrent neural network framework with connectionist temporal classification has been used for optical music recognition. However, its loss function is calculated in a serial mode, which leads to low training efficiency and difficulty in convergence. Additionally, because of gradient disappearance in excessively long sequences, existing models find it hard to learn the relationships between musical symbols, resulting in a high sequence error rate. Therefore, we propose a sequence-to-sequence framework based on the Transformer with a masked language model to deal with these problems. The context representation of musical symbols can be captured further by the self-attention module of the Transformer, which will reduce the sequence error rate. In addition, we refer to the masked language model to design a mask matrix that predicts each symbol in a parallel way, so as to speed up the training process. Our experiments are carried out on the printed images of music staves dataset, and the results show that our proposed method is training-efficient and yields an improvement in accuracy.
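
The abstract does not spell out how the mask matrix is built, so the following is only a minimal sketch of the general idea, assuming the standard lower-triangular (causal) mask used in Transformer decoders. With such a mask, attention for every output position can be computed in one pass during training, while each position still sees only the symbols before it. The function names `causal_mask` and `masked_attention` and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math


def causal_mask(n):
    """Return an n x n 0/1 matrix: position i may attend to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def masked_attention(q, k, v, mask):
    """Scaled dot-product attention with a 0/1 mask (assumed design).

    q, k, v are lists of equal-length vectors. Masked scores are set to a
    large negative number so their softmax weight becomes zero.
    """
    d = len(q[0])
    outputs, weights = [], []
    for i, qi in enumerate(q):
        scores = []
        for j, kj in enumerate(k):
            score = sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
            scores.append(score if mask[i][j] else -1e9)
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * v[j][c] for j in range(len(v))) for c in range(len(v[0]))]
        )
    return outputs, weights


# Toy embeddings for three symbols (illustrative values only).
symbols = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mask = causal_mask(len(symbols))
outputs, weights = masked_attention(symbols, symbols, symbols, mask)
for row in weights:
    print([round(x, 3) for x in row])
```

Each printed row holds the attention weights for one position. Entries above the diagonal are zero, which is what allows all positions to be trained together without any of them looking at later symbols.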

Similar resources

Optical Music Recognition with Convolutional Sequence-to-Sequence Models

Optical Music Recognition (OMR) is an important technology within Music Information Retrieval. Deep learning models show promising results on OMR tasks, but symbol-level annotated data sets of sufficient size to train such models are not available and difficult to develop. We present a deep learning architecture called a Convolutional Sequence-to-Sequence model to both move towards an end-to-en...

A Framework for Exploring the Frequent Patterns based on Activities Sequence

In recent years, the growing use of location-based tools has made it possible to produce geometric trajectories from users' movement paths. In this way, users' travel goals and related activities can be considered in addition to the geometry and route shape. The user activity trajectory represents the sequence of the visited activities and its related analysis as presented i...

Sequence memoizer based language model for Russian speech recognition

In this paper, we propose a novel language model for Russian large vocabulary speech recognition based on sequence memoizer modeling technique. Sequence memoizer is a long span text dependency model and was initially proposed for character language modeling. Here, we use it to build word level language model (LM) in ASR. We compare its performance with recurrent neural network (RNN) LM, which a...

A MODEL FOR THE BASIC HELIX-LOOP-HELIX MOTIF AND ITS SEQUENCE SPECIFIC RECOGNITION OF DNA

A three-dimensional model of the basic Helix-Loop-Helix motif and its sequence specific recognition of DNA is described. The basic-helix I is modeled as a continuous α-helix because no α-helix breaking residue is found between the basic region and the first helix. When the basic region of the two peptide monomers are aligned in the successive major groove of the cognate DNA, the hydrophobi...

Seismic Data Forecasting: A Sequence Prediction or a Sequence Recognition Task

In this paper, we have tried to predict earthquake events in a cluster of seismic data on the Pacific Ring of Fire, using multivariate adaptive regression splines (MARS). The model is employed as either a predictor for a sequence prediction task, or a binary classifier for a sequence recognition problem, which could alternatively help to predict an event. Here, we explain that sequence prediction/r...

Journal

Journal title: IEEE Access

Year: 2022

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2022.3220878